Abstract—Ubiquitous sensing is tightly coupled with activity
recognition. This survey reviews recent advances in Ubiquitous
sensing and looks ahead to promising future directions. In
particular, Ubiquitous sensing is crossing new barriers, giving us new
ways to interact with the environment or to inspect our psyche.
Through sensing paradigms that parasitically utilise stimuli from
the noise of environmental, third-party pre-installed systems,
sensing leaves the boundaries of the personal domain. Compared
to previous environmental sensing approaches, these new systems
mitigate high installation and placement cost by providing
robustness towards process noise. On the other hand, sensing
focuses inward and attempts to capture mental activities such
as cognitive load, fatigue or emotion through advances in, for
instance, eye-gaze sensing systems or the interpretation of body
gesture or pose. This survey summarises these developments
and discusses current research questions and promising future
directions.

Index Terms—Ubiquitous sensing, activity recognition, device-free,
sentiment sensing, pervasive computing, RF signals.

I. Introduction

With the deep penetration of smart and mobile devices,
we continuously carry sensors of all kinds with us, which
monitor every location, situation and activity. More and more
applications are exploiting these capabilities. Google Now,
foursquare, Facebook, Twitter and others gather, analyse and
exploit large amounts of instantaneous, personalised information.
With this data, we can provide novel, intelligent and
personalised services to users.

Development divisions in industry are currently exploring 
these possibilities, while research is evolving towards new 
frontiers; we see two main directions of this development: 

Parasitic sensing 

The parasitic utilisation of environmental, ubiquitously
available sources, in contrast to sensors on
isolated, personal devices.

Sentiment sensing 

Interpreting sensor information to recognise mental
states, intention, attention, emotion and cognitive
activities of individuals.

As depicted in figure 1, in traditional Ubiquitous Sensing, the
focus of the sensing system lies on the status of a mobile,
personal device or sensors attached to an individual and on
this individual’s directly observable actions (figure 1(a)). The
environment (surroundings, crowd, situations) is typically
(a) Device- and individual-focused sensing of directly observable states

(b) Parasitic and Sentiment sensing

Fig. 1. Classical and future Ubiquitous sensing paradigms


not covered by personal device sensors. Consequently, the
device is in a sense short-sighted, with its perception limited
to an isolated individual. Moreover, considering a complete
individual with her plans, emotions, intentions and mental
states, classical sensing captures only the surface of that
complex human system. Gradually, this focus is shifting towards
the recognition of mental states, intention or emotion of
individuals, while environmental sensing sources that combine
zero installation cost with ubiquitous availability are
increasingly employed. Only recently, a special issue of the IEEE
Pervasive Computing magazine focused on the recognition of 
attention via sensing modalities Q. 

As indicated in figure 1, the new sensing paradigms extend
the sensing range twofold: on the one hand, through the
utilisation of environmental sources, additional and more
fine-grained information on situations and surrounding entities is
available. On the other hand, additional information on mental
states can be derived.

Parasitic sensing utilises environmental, ubiquitous sensing 
sources such as, for instance, audio or radio frequency lO, 
a, a, a and thereby extends the perception of the sensing 
system beyond the boundaries of an individual device or person.
Through the utilisation of stimuli from already installed
infrastructure, coverage is maximised while installation cost is
minimised.

Sentiment sensing focuses on people’s mental state, intention
or emotion, for instance, by interpreting eye-gaze
information a, a, body gesture or pose a, a and thereby 
directs and extends the perception of a sensing system inwards. 

In this survey, we detail current advances towards parasitic 
and sentiment sensing and discuss open research challenges 
and promising future directions. 

II. Overview 

This section briefly sketches recent developments that will
foster and induce Parasitic and Sentiment Sensing. Then, in
section III and section IV, current advances in these directions
are detailed before, in section V, lively discussed topics and
future directions are introduced.

A. The route to Parasitic Sensing 

Over the last decade, we have seen remarkable progress in 
the recognition of human activities or situations uni, CD, 
ca, ca This was driven by several strong developments in 
related areas. First of all, sensing hardware has been greatly
improved (e.g. size, accuracy and also new sensing modalities
and measurable quantities), enabling an enhanced perception of
the world through sensors. Also, machine learning has celebrated
great successes (algorithms, toolboxes) and has become
a mainstream skill that attracts a huge user base towards activity
recognition. Furthermore, rapid development in wireless
protocols and near-global coverage of some technologies (e.g.
UMTS, LTE) enabled transmission at higher data rates
and new usage areas through wireless communication. Last,
but not least, novel applications have spread that promote the
publishing and sharing of all kinds of data (e.g. Facebook,
Line, WhatsApp), which has led to novel, valuable inputs for
activity recognition.

Even given this progress and innovation, the field
is on the verge of another disruptive change that will
revolutionise usage patterns and open a multitude of new
research directions.

Activity recognition in Ubicomp is moving towards Big Data,
with systems developing capabilities to monitor virtually everybody,
everywhere and without specifically installing system
components at any particular physical location.

Fostered by the advancing Internet of Things and
fuelled by Opportunistic and Participatory Sensing campaigns
(cf. figure 2), we have been able to follow this development
in recent years.

Opportunistic sensing has been viewed as one likely future 
of sensing ifTH . Distributed devices provide their sensing 



Fig. 2. Participatory and Opportunistic sensing paradigms 

capabilities to neighbouring devices, that are then empowered 
to access the remotely sensed information or to generate tasks 
for remote devices to acquire and share this information iTTSll . 
csi. This is a promising concept which greatly extends the 
perception of a mobile device to the joint perception of its 
neighbouring devices and environment. In the frame of the 
OPPORTUNITY project an architecture for opportunistic 
sensing, in particular activity recognition was developed uni, 
da. However, a broad application and utilisation
of Opportunistic sensing has not yet been seen.

Opportunistic Sensing raises a number of issues, not only
regarding the mere technical implementation, protocols, mobility
and timing. It also touches aspects of privacy and
security when alien devices are allowed to access potentially
privacy-related personalised information in an uncontrolled
manner ca, CD. In particular, the concept envisions that 
arbitrary sensors can be accessed so that, apart from the
tremendous challenge of enabling seamless interaction technically,
the design of a privacy- or security-preserving scheme
is a nightmare which, given the nearly unlimited possibilities and
security threats posed by all the sensors, can hardly be solved.

With the proposal of Participatory Sensing [20], the privacy
issues of Opportunistic sensing are solved pragmatically. In
this sensing principle, remote sensing is restricted to user- 
controlled mobile devices. Remote devices are still expected to 
task neighbouring devices for sensed information, but human 
interaction is required in order to approve such request CD. 
Consequently, not only is the range of devices restricted to explicitly
user-controlled devices with an interactive interface but
also the important principle of calmness and unobtrusiveness
in Pervasive Computing is disregarded. Instead, the mental
load for a user with a Participatory Sensing device is likely
to increase significantly, as she might be frequently interrupted
for interaction.

However, these developments indicate the direction in which
activity recognition and sensing as a whole are developing. Instead
of utilising device-bound sensors with limited range, future
sensing will increasingly incorporate environmental sensing
sources which have the potential to extend the perception of a
¹Opportunity Project website: http://www.opportunity-project.eu/ (May
2014)
sensing device beyond its physical boundaries. Yet,
as discussed above, the reliance on explicit hardware sensors in
the environment introduces communication overhead as well
as technical, privacy and security-related problems. As long as
there is no real incentive for device owners to make sensors
on their devices available to the public, they will rather choose
to protect their security and privacy as well as their battery by
granting exclusively local access to sensors on a device.

A less problematic and yet simpler way to extend the 
perception of a mobile device into the environment is the 
utilisation of environmental stimuli that can be extracted 
from the noise of other systems. The parasitic usage of 
environmental noise has been demonstrated by infrastructure 
mediated sensing paradigms 1^ . fT2\ , audio-based ll23l and 
radio-frequency based approaches 1^ . (21 as detailed in 
section III. We believe that the greatest potential lies in
RF-based systems since (A) RF is available ubiquitously (free
frequency spectrum is scarce all over the world), (B) virtually
all contemporary electronic devices incorporate an interface
to the radio channel and (C) novel technical developments
such as OFDM (cf. section V) incorporate properties that will
likely lead to better recognition accuracies on cheap off-the-shelf
consumer devices.

This development is already under way, with the community
increasingly considering device-free techniques that relieve the
monitored individuals from the burden of actually wearing
any sensing hardware. This evolution will continue in the
direction of passive, device-free systems which operate
parasitically by re-using noisy emissions from ubiquitously
available, environmental third-party pre-installed technology.

B. The Route to Sentiment Sensing 

Activity Recognition started out in the 1990s with detecting very
simple physical states, the modes of locomotion: walking, sitting
and standing. We have come a long way from these simple
classes to tracking many high-level activities, like car repair,
furniture assembly and Kung Fu exercises [25], [26].

The dedicated sensor systems used in the labs were not
easily deployable. Yet, this changed with the advent of the
smart phone as a general computing platform. Suddenly, “cheap”
motion sensors were available to everybody. Still, using smart
phones or other consumer devices also brought new challenges.
The position and orientation of the devices was no
longer fixed. One had to cope with location and orientation
changes of the sensors GU. 

Next, we saw a push towards physiological sensing, first in
the medical application domain, then increasingly also in
sports and fitness research.

Now, more and more people are becoming interested in the brain and
brain functions. Cognitive science, psychology, medicine and
related fields have gathered rich information about cognitive
processes. Therefore, we now have a sufficient basis to explore
cognitive task tracking in everyday life [28].

The first impacts are already visible in the medical domain.
Here we see that sensor data from smart phones can predict
depression episodes in patients with mental illnesses. Motion
data seems to correlate well with some mental states. The
same holds for physiological data. Heart rate, blood oxygen
level etc. can tell a lot about our cognitive condition, especially
when combined with motion sensors (e.g. if a user does not move
much and his heart rate is increased, it could signify that he is
excited) [29].

Yet, more interestingly, there are a couple of sensor modalities
that track brain activity directly or indirectly, and we see them more
and more embedded in consumer devices (e.g. the Emotiv
headset to track brain activity using EEG).

It seems obvious to track brain activity directly using EEG
or other brain imaging technologies. However, these technologies
have severe limitations; either they are quite expensive
and bulky (e.g. magnetic resonance imaging) or they require
heavy filtering and analysis. As our skull is quite thick, brain
signals are easily overshadowed by motion artifacts etc.

One promising alternative is to use eye tracking, as gaze is
directly correlated to some of the higher brain functions. There
are two common approaches. Optical eye tracking uses infrared
light and a camera to track the pupil. Electrooculography
uses electrodes to track eye movements, as our eye is a
dipole [30].

III. Device-Free / Radio Sensing

Sensing modalities for activity recognition or monitoring
differ in their installation effort and range (cf. figure 3).
The figure summarises popular modalities and characterises
them for device-bound and device-free (DF) approaches.
Within the device-free techniques, we observe a shift
of attention towards the evaluation of environmental, measurable
quantities of pre-installed third-party systems which are
cheap to use and have increasingly wider physical boundaries.

Researchers have shown remarkable accuracy in tracking 
activities such as, among others, walking, running, cycling, 
climbing/descending stairs, sleep states and mobile phone 

usage (n, da, da. 

However, an implicit requirement of these sensing modalities
is that the entity or individual to monitor has to cooperate
and actually wear the device (device-bound).

In contrast to this, for device-free approaches, the sensing
modality need not be worn by the monitored subject. We
can distinguish between classical systems installed specifically
for a sensing task and systems which are parasitically
utilised for sensing but which were originally installed
and are utilised for other primary purposes. Classical device-free
systems cover, for instance, video (55ll . (56l . infrared (581, 
(87l . pressure (6^ or ultrasound (63l . (64l sensors. A clear 
disadvantage of these approaches is their high installation 
effort. 

This effort can be mitigated by infrastructure-mediated
sensing paradigms [21], [22]. In general, the approach here
is to utilise existing installations, for example, in homes or 
office buildings, for sensing purposes. For instance, pressure 
patterns in residential water pipes might indicate specific
activities/usage of inhabitants CD, ca, or electromagnetic
interference in various electric systems can be utilised to classify
activities (6^ . cni. However, these sensing capabilities
are limited to indoor application and single buildings.
Device-bound


Inertial sensors 

Accelerometer devices are rapidly becoming ubiquitous in modern-day technology [sT], [21-. They are employed for a broad range of use cases, from mere environmental
adjustment of devices to the recognition of an individual user’s situation [33], O], [3^. Multiple sensors attached at multiple body locations are utilised to recognise
different activities |^, [21. [S], [^. Other related sensors are vibration sensors 1^ . or magnetic resonant coupling 14^ .

Biosensors 

Sensors to monitor the heart rate are employed to predict physical activity [41], [42], [4^ . Also popular in health-related applications is the monitoring of blood
pressure [44] or electrocardiography (ECG) f4^ . [4^ . In addition, electromyography (EMG) sensors are used to monitor the health status [47] or, e.g., facial EMG to
support eye-gaze tracking sensors [S^. This sensor class is suited to recording muscle activity (surface EMG electrodes) [48], [49].

RF-based 

Device localisation is possible by employing WiFi signal strength and signal-to-noise ratio [S^j, signal strength information from the active set at a GSM terminal [51],
Ea, or also via signal strength information of a set of signals received from nearby FM radio stations [5^. [54].


Device-free 


Installation-based 

Video 

Recognition of activities from video can reach 
remarkable accuracies [55]. Activities are identified 
via matching of templates, neighbour based or via
statistical modelling [56], [53-. However, video has
high installation cost, is strictly range limited, fails
in darkness and may violate privacy.

Infrared 

Capturing of radiated infrared waves emitted from 
objects. Infrared can be employed as imaging 
technology similar to video but with the benefit that 
human motion can be easily detected from the 
background regardless of the lighting conditions 
and colors of the human clothing and surfaces [5^ . 
|59]. The technique is limited in sensing range and 
requires careful and dense deployment. 

Pressure 

Pressure sensors typically exploit the change of 
conductivity due to deformation or expanding of 
wires and can be integrated in fiber of textiles f^ . 
[^ . They are utilised to track footsteps and 
locations of individuals as well as touch-interaction 
with the environment [^. Installation cost is 
typically high and requires careful deployment. 

Ultrasound 

Ultrasound can indicate relative location of a 
pair of devices via Time-Of-Flight (TOF) [^ . 
Accuracy can be improved via combination with 
radio frequency [64]. 

Depth camera 

Equipped with a depth camera and capable of 
voice interaction, the Kinect device is able to 
accurately track gestures of persons [65], [66] and 
interaction [67]. 


Infrastructure-mediated 

Exploitation of alternative sensing modalities which 
are pre-installed and readily available in environ¬ 
ments and therefore minimise installation cost. 

Resistance; inductive electrical load 

Alterations in resistance and inductive electrical 
load in a residential power supply system can be 
exploited to detect human interaction in a build¬ 
ing [^. Authors leveraged transients generated by 
mechanically switched motor loads to detect and 
classify human interaction from electrical events. 

Electromagnetic interference (EMI) 

Gupta et al. analysed electromagnetic interference
(EMI) from switched mode power supplies (SMPS)
in order to detect human interaction with electrical
systems im. It is even possible to detect the proximity
of the human body to a fluorescent lamp
from the change in impedance in the EMI
structures [70].

Water pressure 

Leveraging residential water pipes, the change 
in water-pressure within the pipe system can be 
utilised to classify water-related activities and their 
location in the house (flushing toilet, washing 
hands, showering,...) [TT], [23, [Z3- 

Gas consumption 

With a single sensing point, gas use can be identi¬ 
fied down to its source (e.g., water heater, furnace, 
fireplace) ED. The authors monitor the gas-flow via 
a microphone sensor. 

Electromagnetic noise 

Using electrostatic discharges from humans touch¬ 
ing environmental structure, it is possible to detect 
locations that have been touched and gestures 
from electromagnetic noise [75], [76]. 


Environmental / Parasitic 

Audio 

Audio can be utilised to identify the location of 
a phone on room-level and also various in-room 
(e.g. on table, in drawer) or on-body locations (e.g. 
pocket) [0]. Furthermore, audio-fingerprints can 
serve as a sense of proximity among devices[4]. 

Radio frequency 

Passive Radar describes a class of radar systems 
that detect and track objects (vehicles, individuals) 
by processing reflections from non-cooperative 
sources of illumination in the environment, such 
as commercial broadcast and communications 
signals (HF radio, UHF TV, DAB, DVB, GSM) E], 
[78l . In these systems, no dedicated transmitter 
is involved but the receiver uses third-party 
transmitters. It then measures the time difference 
of arrival between Line-of-Sight (LoS) signals 
and signals reflected from an object. By this it 
is possible to determine the bistatic range of an 
object and its heading and speed via Doppler Shift 
and its direction of arrival. Such systems can
operate at ranges of several 100 km but are very
expensive.

Recognition of movement is also possible with 
simpler hardware (WiFi, Sensor nodes, Software- 
defined-radio) considering the interception of LoS 
paths between pairs of nodes [7^. In addition, 
highly accurate localisation was demonstrated by 
extracting the LoS components among a grid of 
nodes [^. Furthermore, it is possible with similar 
installations to distinguish activities and gestures 
(via Doppler fluctuations) [2], environmental 
situation [^ as well as attention levels (utilising 
changes in speed and direction as indicators) [^ 
and breathing rate [S^. 


Fig. 3. Overview of various device-bound and device-free sensing modalities in the literature


This limitation is relaxed by systems that utilise environ¬ 
mental sources, such as radio frequency (RF) or audio CD, 

Ga. 

In the present survey, we focus on the most recent developments
in radio-based device-free recognition. Such systems monitor
changes observed on the RF channel and analyse them for
characteristic patterns. Changes in the location of objects
or movement of individuals cause variation in the radio
channel characteristics. For instance, due to blocked, damped
or reflected paths of some of the signals superimposed at a
receive node, the absolute signal strength might differ. Also,
movement might induce Doppler shift in reflected signals
and thus lead to changes in the distribution of energy over
frequency bands at the receiver. Figure 4 summarises relevant
radio effects that can be exploited for environmental awareness
from received RF signals.
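
The two effects named above, Doppler shift and distance-dependent signal strength, can be illustrated numerically. The following is a minimal sketch, not taken from any of the surveyed systems. It assumes the textbook forms f_d = (v / c) * f * cos(alpha) and P_rx = P_tx * (lambda / (4 * pi * d))^r * G_tx * G_rx; the function names, the default gains of 1 and the example values are hypothetical.

```python
import math

C = 3.0e8  # propagation speed in m/s (assumed value, as in the survey text)


def doppler_shift(speed_mps, carrier_hz, angle_rad):
    """Doppler shift f_d = (v / c) * f * cos(alpha) in Hz."""
    return speed_mps / C * carrier_hz * math.cos(angle_rad)


def received_power(p_tx_w, carrier_hz, distance_m,
                   g_tx=1.0, g_rx=1.0, exponent=2.0):
    """Received power in watts for a path-loss exponent r (2 = free space)."""
    wavelength = C / carrier_hz
    loss = (wavelength / (4.0 * math.pi * distance_m)) ** exponent
    return p_tx_w * loss * g_tx * g_rx


def to_dbm(power_w):
    """Convert watts to dBm."""
    return 10.0 * math.log10(power_w * 1000.0)


if __name__ == "__main__":
    wifi = 2.4e9  # hypothetical carrier frequency in Hz
    # A person walking at 1.5 m/s towards the receiver shifts a 2.4 GHz
    # signal by roughly 12 Hz; movement across the link (90 degrees) by ~0 Hz.
    print(doppler_shift(1.5, wifi, 0.0))
    print(doppler_shift(1.5, wifi, math.pi / 2))
    # Received power drops with distance, faster for larger exponents.
    for d in (1.0, 5.0, 10.0):
        print(d, to_dbm(received_power(0.1, wifi, d)),
              to_dbm(received_power(0.1, wifi, d, exponent=3.5)))
```

The small Doppler values at walking speed hint at why frequency-domain features generally need finely sampled, continuous signals rather than coarse signal strength readings.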


An early example of a system utilising WiFi signals for
the localisation of a receive device is the RADAR system
that employed signal strength and signal-to-noise-ratio (SNR)
from WiFi [50]. Other implementations utilised GSM for
localisation by employing signal strength readings from the
active set ED, ED or signal strength from a set of FM
base stations ES, [54]. Frequently, these approaches require
the creation of a received signal strength (RSS) fingerprint
map in, EOl, ED, but also real-time on-line localisation
that does not require a fingerprinting map is feasible E21,
EH, ca, EH, ca. The latter approaches combine, for
instance, dead reckoning methods with characteristic,
crowd-identified waypoints for accurate relative localisation. These
systems are device-bound and can reach high accuracy of
about 1 meter 1^ .

Fig. 5. RF-based device-free activity recognition systems and their recognition capabilities and system configuration considered. The figure groups
the corresponding references to each system under the respective class.

For device-free approaches, on the other hand, the monitored
entity is not equipped with any transmit or receive
device [79]. We distinguish between four classes of such
recognition systems conditioned on their hardware configuration
(cf. figure 5). These systems can be grouped into
active and passive approaches conditioned on the presence
of an active transmitter [96]. Active systems control both
transmit and receive hardware, while passive systems only
utilise receive devices. Most current systems are active, such
that both the receiver and the transmitter are under the control
of the system. Generally, the classification accuracy of an
RF-based device-free recognition system suffers when the
transmitter is third-party controlled.

Many early studies utilise continuous signals captured by 
Software-Defined Radio (SDR) devices for their more accurate 
and complete access to the radio channel. These systems can 
exploit continuous signals received on the wireless channel 
and sampled at a high frequency, which enables the utilisation 
of frequency domain features. 

In contrast, consumer devices seldom feature SDR capabilities.
On such devices, the Received Signal
Strength Indicator (RSSI) is frequently exploited as an indicator for
surrounding activities and situations.

Figure 5 indicates research achievements demonstrated for
the respective classes and system configurations by various
groups. Most results so far have been achieved for active,
continuous signal-based systems. In contrast, passive RSSI-based
systems have only recently been considered. In addition, most
work considers the recognition or localisation of individuals
(presence, location). For continuous signal-based systems,
more complex cases like activities have also been considered.
More complex system configurations or classes are today less
frequently investigated and partly also constitute open research
questions.

The following sections describe the research conducted in these
fields in more detail and also cover comparative measures like
accuracy of recognition.

A. Localisation 

Device-free RF-based recognition was first investigated for
the task of localisation or tracking of an individual. Youssef
defines this approach in [79] as Device-Free Localisation (DFL):
localising or tracking a person using RF signals while the entity
monitored is not required to carry an active transmitter or
receiver.

In the following, we distinguish between preliminary studies 
considering basic impacts of presence and movement on a 
received radio signal, radio tomographic imaging approaches, 
RF-fingerprinting methods, anomaly detection methods and 
approaches that isolate direct links among nodes in order to 
analyse their fluctuation. 

1) Impact of presence and movement: Youssef et al. analysed
the impact of presence on a received radio signal and
defined three tasks for DFL: detection of presence, tracking
of persons and predicting the identity of individuals [79]. For the
mere detection of presence, they analysed the moving variance
and moving average of the time-domain
RSSI values from transmitting and receiving pairs of WiFi
devices (access points (AP) and mobile terminals). Classification
accuracy reached up to 1.0 for some configurations. In
Effects on the radio channel

Radio Frequency (RF) signals are electromagnetic waves, emitted approximately
omnidirectionally and approximately at speed $c = 3 \cdot 10^{8}\,\mathrm{m/s}$ from
a transmit antenna. At a receiver, all incoming signal components $\zeta_i$
add up to a received sum signal

$$\zeta_{\mathrm{sum}} = \Re\left( m(t)\, e^{j 2\pi f_c t} \sum_{i} \mathrm{RSS}_i\, e^{j(\gamma_i + \varphi_i)} \right) \qquad (1)$$

at a center frequency $f_c$. We represent the received signal strength of signal $i$
as $\mathrm{RSS}_i$ and its shift in phase from signal generation and due to transmission
delay by $\gamma_i$ and $\varphi_i$.

In the following, we briefly describe radio effects that are relevant for
device-free radio-based recognition systems.

Multipath propagation

Signals might be reflected and scattered at obstacles so that a transmitted signal
$\zeta_i$ might reach a receiver via various paths and with different signal delays.

Signal fading

These incoming copies of an individual signal $\zeta_i$ cause constructive and
destructive interference at their superimposition at a receiver (fast fading). In
contrast, slow fading occurs as a result of environmental changes that impact
signal propagation (e.g. passing cars, moving trees).

Blocking and damping of signals

Conditioned on its frequency $f_i$ and the material encountered, a signal $\zeta_i$ is
damped or even blocked by obstacles.

Doppler shift

Relative movement between the transmitter and receiver incurs a change
in frequency of a signal $\zeta_i$. This Doppler shift $f_d$ is conditioned on the
relative speed $v_i$ between transmitter and receiver, the frequency $f_i$ and
the angle $\alpha_i$ of the movement direction between transmitter and receiver:

$$f_d = \frac{v_i \cdot f_i}{c} \cdot \cos(\alpha_i)$$

Path loss

The signal strength of an RF signal reduces with distance. A straightforward
estimate of this path loss is given by the Friis free-space
equation [88]

$$P_{rx} = P_{tx} \cdot \left( \frac{\lambda}{4 \pi d_i} \right)^{r} \cdot G_{tx} \cdot G_{rx} \qquad (2)$$

Here, $P_{tx}$ describes the transmit signal strength, $G_{tx}$, $G_{rx}$ represent the
antenna gain at transmit and receive devices, $\lambda$ describes the wavelength
and $d_i$ is the distance traversed. The path-loss exponent $r$ differs with
the environment and typically takes values between 2 and 5.

Fig. 4. Summary of some radio effects that can be exploited for RF-based
Device-Free recognition


order to track individuals, they proposed the use of a passive
radio map (see section III-A4).
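
The moving average and moving variance features used for presence detection can be sketched as follows. This is only an illustration under stated assumptions: the window length, the variance threshold and the function names are hypothetical and are not taken from the cited work, and the sample values in the demo are made up.

```python
from collections import deque


def moving_stats(rssi, window):
    """Return (moving average, moving variance) for every full window."""
    buf = deque(maxlen=window)
    stats = []
    for sample in rssi:
        buf.append(sample)
        if len(buf) == window:
            mean = sum(buf) / window
            var = sum((v - mean) ** 2 for v in buf) / window
            stats.append((mean, var))
    return stats


def presence_flags(rssi, window, var_threshold):
    """Flag a window as 'presence' when its variance exceeds the threshold.

    The threshold is a hypothetical tuning parameter; in practice it would
    have to be calibrated per environment.
    """
    return [var > var_threshold for _, var in moving_stats(rssi, window)]


if __name__ == "__main__":
    quiet = [-50.0] * 8                      # made-up RSSI values in dBm
    busy = [-50.0, -44.0, -57.0, -47.0, -58.0, -45.0, -55.0, -49.0]
    print(presence_flags(quiet, window=4, var_threshold=2.0))
    print(presence_flags(busy, window=4, var_threshold=2.0))
```

A variance feature reacts to fluctuation regardless of the absolute signal level, which is one reason it tends to be less sensitive to slow drifts than the moving average alone.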

Kosba et al. presented in ED a similar system to detect
motion from RF readings of standard WiFi hardware. Their
system utilises a short offline training phase in which the absence
of movement and activity is assumed as a baseline. Then, anomaly
detection is employed in order to detect changes from that
baseline. The authors considered mean- and variance-related
features and concluded that the variance is better suited to
detect changes in the RSSI. In contrast to the works of Zhang
and others, this system does not require WiFi nodes to be
located in an exactly defined grid with fixed node distances.
Consequently, only the detection of presence is possible,
not localisation.
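
A baseline-then-detect scheme of this kind can be sketched in a few lines. The sketch below is an assumption-laden illustration, not the cited authors' method: it learns a threshold as mean plus k standard deviations of the window variances observed during a silent training phase, and the names `train_threshold`, `detect_motion` and the factor `k` are hypothetical.

```python
import statistics


def window_variances(rssi, window):
    """Population variance of each consecutive, non-overlapping window."""
    return [statistics.pvariance(rssi[i:i + window])
            for i in range(0, len(rssi) - window + 1, window)]


def train_threshold(baseline_rssi, window, k=3.0):
    """Learn a variance threshold from a recording without movement.

    The rule 'mean + k * standard deviation' is an assumed, simple
    anomaly criterion chosen for illustration.
    """
    variances = window_variances(baseline_rssi, window)
    return statistics.mean(variances) + k * statistics.pstdev(variances)


def detect_motion(rssi, window, threshold):
    """Return one flag per window: True if its variance is anomalous."""
    return [v > threshold for v in window_variances(rssi, window)]


if __name__ == "__main__":
    baseline = [-60, -61] * 50          # made-up silent-phase RSSI in dBm
    threshold = train_threshold(baseline, window=10)
    live = [-60] * 10 + [-60, -50] * 5  # quiet window, then strong fluctuation
    print(detect_motion(live, window=10, threshold=threshold))
```

Because the threshold is derived from the environment's own quiet-state statistics, no grid layout or known node distances are needed, which matches the trade-off described above: presence can be flagged, but not located.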

Also, Lee et al. consider the utilisation of RSSI fluctuation
from pairs of communicating TelosB nodes for intrusion
detection 19^ . In five distinct environments (outdoor and indoor)
they reported changes in the mean and standard deviation of
absolute RSSI values.

Utilising a passive, FM-radio based system with SDR devices,
Popleteev indicated that frequency diversity can help
to improve localisation accuracy of RF-based systems [99].
In particular, he considered a person located at 5 different
locations inside a room and predicted the location with a
standard k-nearest neighbour approach. In addition, the author
pointed out that the classification accuracy of the system
deteriorates when the system is trained on one day but
classification is conducted on another day.
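
A k-nearest neighbour classifier over signal strength vectors can be sketched as below. This is a generic illustration, assuming that each training sample is a vector of signal strengths (one entry per observed frequency) labelled with a location; the labels, the feature values and the function name are made up and do not reproduce the cited experiment.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training samples (Euclidean distance).

    train: list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical fingerprints: signal strength (dBm) on three channels.
    train = [
        ([-70, -81, -60], "loc_A"), ([-71, -80, -61], "loc_A"),
        ([-69, -79, -59], "loc_A"),
        ([-60, -90, -75], "loc_B"), ([-61, -89, -74], "loc_B"),
        ([-59, -91, -76], "loc_B"),
    ]
    print(knn_predict(train, [-70, -80, -60]))
    print(knn_predict(train, [-60, -90, -75]))
```

Adding more channels lengthens the feature vector, which is one way to picture how frequency diversity can separate locations that look alike on a single frequency. The day-to-day degradation noted above corresponds to the training vectors drifting away from the live measurements.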

Lieckfeldt and others considered the impact of the presence
of a single individual on the received signal strength observed
by an RFID reader in a 2 m × 2 m area equipped with 69
passive RFID tags [100]. Their system utilised a two-stage
approach in which first the RSSI fluctuation without presence
was recorded and later, presence was detected via the observed
changes in the signal strength from the set of tags. The
authors observed that the backward link is more expressive
for the recognition of presence than the forward link from the
reader. In addition, they considered different orientations of the
monitored individual in order to arrive at more general results.

2) Radio tomographic imaging: Tomography describes the 
visualisation of objects via a penetrating wave. An image is 
then created by analysing the received wave or its reflections 
from objects. A detailed introduction to obstacle mapping 
based on wireless measurements is given in [101], [102]. Radio 
tomography was, for instance, exploited by Wilson et al. in 
order to locate persons through walls in a room [103]. 
Their system exploits the variance of the RSSI at 34 nodes 
that encircle an area in order to locate movement inside that 
area. The nodes implement a simple token-passing protocol to 
synchronise successive transmissions. The transmitted signals 
are received and analysed by the other nodes in order to 
generate the tomographic image, relying heavily on Kalman 
filters. They were able to distinguish a vacant area from an 
area with a person standing and a person moving. In addition, 
it was possible to identify the location of objects and to 
track the path taken by a person walking at moderate speed. 
An individual image is taken over windows of 10 seconds each. 
By utilising the two-way RSSI fluctuations among nodes, an 
average localisation error of 0.5 meters was reached [104]. 
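
To illustrate how per-link RSSI variance can be turned into an image, the following sketch uses plain back-projection: each link adds its variance to the pixels lying close to its direct path. This is a deliberate simplification and an assumption of this example; it is not the image reconstruction used in the cited systems, and the node layout, variances and ellipse width are hypothetical.

```python
from math import hypot

def backproject(nodes, link_variance, grid, width=0.1):
    """Accumulate each link's RSSI variance onto the pixels close to that link.

    A pixel counts as lying on a link when the detour via the pixel is at most
    `width` metres longer than the direct path (an ellipse around the link).
    """
    image = {}
    for px, py in grid:
        total = 0.0
        for (i, j), var in link_variance.items():
            (x1, y1), (x2, y2) = nodes[i], nodes[j]
            direct = hypot(x2 - x1, y2 - y1)
            detour = hypot(px - x1, py - y1) + hypot(px - x2, py - y2)
            if detour - direct <= width:
                total += var
        image[(px, py)] = total
    return image

# Hypothetical 4 m x 4 m area with one node per corner.
nodes = {0: (0, 0), 1: (4, 0), 2: (4, 4), 3: (0, 4)}
# Both diagonals show high variance, the border links do not.
link_variance = {(0, 2): 5.0, (1, 3): 5.0,
                 (0, 1): 0.2, (1, 2): 0.2, (2, 3): 0.2, (0, 3): 0.2}
grid = [(x, y) for x in range(5) for y in range(5)]
image = backproject(nodes, link_variance, grid)
estimate = max(image, key=image.get)  # pixel with the highest accumulated variance
print(estimate)
```

With only four nodes the image is coarse; the many crossing links of a dense deployment are what make a fine-grained image possible.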

It was reported in [105] that the localisation accuracy of 
such a system can be greatly improved by slightly changing 
the location of sensors, thus exploiting physical diversity. The 
authors present a system in which nodes are attached to disks 
equipped with motors in their center for rotation, as depicted 
in figure 6. With this setting it is possible to iteratively learn 
a best configuration (physical location) of nodes similar to, 
for instance, iterative beamforming approaches that try to lock 
several radio signals on the optimal relative phase offset 
[106], [107]. 

Wagner et al. implemented a radio tomographic imaging 
system with passive RFID nodes instead of sensor nodes. 
Following generally the same approach as described above, 
they achieved good localisation performance with their 
system. However, they had to implement a suitable scheduling 
of the probabilistically scattered transmissions of nodes due to 
the less controllable behaviour of passive RFID nodes [108]. In 
later implementations, they improved their system to allow 
on-line tracking [109] and a faster iterative clustering approach 
that shortens the time to the first generated image [110]. 
This first image is of rather low accuracy but is iteratively 
improved in later steps of the algorithm. With this approach, 
it was possible to achieve a localisation error of about 1.4m 
after only one second and a localisation error of 0.5m 
after a total of about seven seconds in a 3.5m^2 area. 

Fig. 6. Illustration of RF sensors exploiting spatial diversity 
via a rotating disk (single sensor) or multi-node 
instrumentation [105] 

Utilising moving transmit and receive nodes and compressive 
sensing theory [111], [112], [113], it is possible to 
greatly reduce the number of nodes required. For instance, 
Gonzalez-Ruiz et al. consider mobile robotic nodes that carry 
transmit and receive devices and circle the monitored target in 
order to generate the tomographic image [?]. In particular, 
they required only two moving robots equipped with rotating 
angular antennas in order to accurately detect objects in the 
monitored area. Each robot takes new measurements every two 
centimeters. Overall, a single image can be taken after about 
10 seconds. They detail their implemented framework in [?] 
and the theoretical framework for the mapping of obstacles, 
including occluded ones, in a robotic cooperative network, 
based on a small number of wireless channel measurements, 
in [?]. 

3) Machine learning: Instead of generating radio-tomographic 
images, which is an accurate but comparatively 
slow procedure, general machine learning approaches 
can also be employed for RF-based localisation. For instance, 
Wagner et al. investigate localisation in a passive RFID 
setting utilising multi-layer perceptrons for training-based 
device-free user localisation [?]. In particular, the authors 
utilised a three-layer neural network that takes a series of 
measurements as input vector and provides a tuple as output 
defining a two-dimensional user location. The localisation 
error could be kept below 0.5 meters in a 3m x 3m square 
area. 

Fig. 7. Device-free RSSI-based localisation of objects via four 
distinct algorithms [?], [80] 

4) RF-Fingerprinting: A common approach to RF-based 
localisation is the construction of radio strength maps. In 
device-based systems, the RSS at various locations is recorded 
and used as a map together with access point IDs [?]. 
With this information, location is later estimated from live 
measurements. Such radio maps may also be deployed in 
device-free systems, in which the RSSI fluctuations in the 
presence of a person not equipped with a transmit or receive 
device are captured. Youssef et al. present such a localisation 
system in [?]. They report that the RSSI is more stable 
overnight when no people are around, so that this is the 
best time to create an RSSI fingerprint map. In a system 
with two transmit and two receive WiFi devices monitoring 
the RSSI in infrastructure mode from beacons sent roughly 
every 100ms, they were able to accurately predict and 
track the location of a single person in an indoor environment. 
Later, they improved their approach to use fewer nodes by 
employing a Bayesian inference algorithm [119]. 
All these experiments were conducted under Line-of-Sight 
(LoS) conditions. A major drawback has been the 
time-consuming manual generation of the fingerprint maps. 
With current systems, however, automated generation of 
RSSI fingerprints on laptop-class computers is also possible [120]. 
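
A fingerprint map combined with Bayesian inference can be sketched as below. The sketch assumes a Gaussian RSSI model per link and a uniform prior over locations; both are assumptions of this example, and the radio map entries are invented values rather than data from the cited systems.

```python
from math import exp, pi, sqrt

def gaussian(x, mean, std):
    """Probability density of a normal distribution."""
    return exp(-0.5 * ((x - mean) / std) ** 2) / (std * sqrt(2 * pi))

def posterior(radio_map, observation):
    """radio_map: {location: [(mean, std) per link]}; uniform prior over locations."""
    scores = {}
    for location, links in radio_map.items():
        likelihood = 1.0
        for rssi, (mean, std) in zip(observation, links):
            likelihood *= gaussian(rssi, mean, std)
        scores[location] = likelihood
    total = sum(scores.values())
    return {location: score / total for location, score in scores.items()}

# Hypothetical fingerprint map: per location, (mean, std) of the RSSI on two links.
radio_map = {
    "empty":    [(-50.0, 1.0), (-60.0, 1.0)],
    "desk":     [(-55.0, 2.0), (-61.0, 2.0)],
    "corridor": [(-51.0, 2.0), (-67.0, 2.0)],
}
post = posterior(radio_map, [-54.5, -61.5])
print(max(post, key=post.get))
```

The "empty" entry corresponds to the map recorded while nobody is around; the remaining entries would be recorded with a person standing at the respective location.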

5) Geometric models and estimation techniques: Finally, in 
systems where the relative location of nodes that transmit and 
receive signals is exactly known, the geometry and layout of 
the instrumentation can be exploited. Zhang et al. employed a 
grid of nodes in order to localise individuals from device-free 
WiFi readings [121]. They proposed a straightforward 
theoretic model to describe signal fluctuation induced by 
passive objects and verified their findings in a case study 
with ceiling-mounted MICA2 sensor nodes transmitting with 
0dBm at 870MHz. The three algorithms proposed (Midpoint, 
Intersection, Best cover) all require an initial training phase in 
which the RF fluctuation is monitored in a stable state with 
no interference from individuals (cf. Frequency Selection 
Algorithm in figure 7). All algorithms utilise knowledge about 
the relative location of nodes and exploit RF-signal strength 
fluctuation on direct links. From this, center locations on the 
direct links, intersections of direct links or 0.5 x 0.5 m^2 areas 
on the direct links are utilised in order to predict the location 
of activity. Best results were achieved with the consideration 
of overlapping areas. The optimum distance between two nodes 
in the grid was experimentally derived as 2 meters. With 
this configuration, a single person moving slowly (0.5 m/s) 
along a straight line was tracked with an accuracy of below 
1m and two persons with an accuracy of below 2m. With 
additional clustering of nodes, the accuracy for the tracking of 
multiple persons could be further improved to slightly more 
than 1m [122]. The transmission power was also demonstrated 
to impact the tracking accuracy: lower transmission powers 
of -6 to -11 dBm were observed to show more dynamic 
values for short node distances. The system was shown to be 
real-time capable in [80]. By clustering the measurement area 
into several frequency-separated cells, spanned by three nodes 
each, the authors could isolate interference from neighbouring 
nodes and also speed up the computation (cf. figure 7). 
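
The idea of predicting a location from the centers of affected direct links can be sketched as a weighted centroid. This is loosely inspired by the Midpoint idea named above and is not the cited algorithm: the weighting by excess fluctuation, the threshold and all values are assumptions of this example.

```python
def midpoint_estimate(nodes, fluctuation, baseline, threshold=2.0):
    """Weighted centroid of the midpoints of all links whose fluctuation
    exceeds the trained baseline by more than `threshold` (in dB)."""
    sx = sy = sw = 0.0
    for (i, j), value in fluctuation.items():
        excess = value - baseline[(i, j)]
        if excess > threshold:
            (x1, y1), (x2, y2) = nodes[i], nodes[j]
            sx += excess * (x1 + x2) / 2
            sy += excess * (y1 + y2) / 2
            sw += excess
    return None if sw == 0 else (sx / sw, sy / sw)

# Hypothetical grid with 2 m node distance.
nodes = {0: (0, 0), 1: (2, 0), 2: (0, 2), 3: (2, 2)}
# Baseline fluctuation per link, recorded in the training phase without persons.
baseline = {(0, 1): 1.0, (0, 2): 1.0, (2, 3): 1.0, (1, 3): 1.0}
# Fluctuation observed while a person is present.
observed = {(0, 1): 1.2, (0, 2): 5.0, (2, 3): 5.0, (1, 3): 1.1}
print(midpoint_estimate(nodes, observed, baseline))
```

The baseline dictionary plays the role of the initial training phase: only links that deviate clearly from their undisturbed state contribute to the estimate.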

Utilising passive RFID transponders, Lieckfeldt et al. explored 
device-free localisation in recent years [123], [124]. 
The authors propose a physical model that describes the effect 
of the relative position of subjects on the signal strength. They 
propose estimators for user localisation based, for instance, 
on maximum likelihood and on geometric methods such as 
nearest intersection points. While the geometric approaches 
suffer from low accuracy, the estimation-based methods are 
characterised by high computational complexity. 

A straightforward approach to localisation based on RSSI 
fluctuation is the consideration of the interception of LoS 
paths in a grid of nodes. A first step in this direction was 
taken by Patwari et al., who derived a statistical model for 
the RSS variance as a function of the location of a single 
individual [125]. They could show that reflection causes the 
RSS variance contours to be shaped approximately like Cassini 
ovals. They also considered the simultaneous localisation of 
multiple individuals and argue that their model could be 
extended to cover this case. This 
was later demonstrated to be feasible in an actual system 
instrumentation by Zhang and others [126]. The authors isolate 
the LoS path by extracting phase information from the 
differences in the RSS on various frequency bands at distributed 
nodes. With this approach, their experimental system is able 
to simultaneously and continuously localise up to 5 persons 
in a changing environment with an accuracy of 1 meter. 


B. Recognition of activities 

Not only static location but also activities, gestures or 
situations in proximity of a receive antenna can be 
distinguished from signal fluctuation over time. For RF-based activity 
recognition, a higher sampling frequency is required than 
for mere localisation or tracking. Depending on the specific 
application, sampling rates between 4Hz and 70Hz are utilised. 
Consequently, methods such as tomographic imaging are too 
slow to achieve reasonable accuracy here. Furthermore, as 
location is not the main interest, geometric models and RF- 
fingerprinting are not employed. Especially the latter captures 
static situations and can therefore not be applied for the 
recognition of dynamic changes over a time window. 

Instead, machine learning techniques are frequently applied 
to analyse fluctuation in signal strength measurements over 
time. In addition to RSS, movement-indicating features 
such as frequency-domain features or Doppler shift are also 
exploited. 

Fig. 8. Recognition of three well separated classes in the 
SenseWaves system [?] 

1) Machine learning and estimation: In their seminal work, 
Patwari et al. report that they are able to detect the breathing 
rate of a single individual by analysing the RSS fluctuation in 
received packets from 20 nodes surrounding the subject [?]. 
Via maximum likelihood estimation, they were able to estimate 
the breathing rate with a Root-Mean-Square-Error (RMSE) 
of 0.3 breaths per minute. Their system consists of TelosB 
nodes transmitting every 240ms on a center frequency of 2.48 
GHz, which translates to an overall packet transmission rate 
of about 4.16Hz. Estimates were taken after measurement 
periods of 10 to 60 seconds. Best results were achieved 
with 25 to 40 seconds, whereas longer observation periods did 
not further improve the accuracy significantly. Naturally, the 
accuracy achieved depended on the number of participating 
nodes. While a single node pair could not achieve usable 
results, an RMSE of only about 1.0 breaths per minute was 
already observed with 7 network nodes. They could further show 
that the links with low average RSS are most significant for 
the detection of breathing rate. 
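
For a single sinusoid in white Gaussian noise, the maximum likelihood frequency estimate is the frequency that maximises the periodogram. The sketch below applies this to several links by summing their periodograms over a grid of candidate rates. The sinusoidal signal model, the candidate range of 6 to 30 breaths per minute and the synthetic, noise-free traces are assumptions of this example, not details taken from the cited work.

```python
from math import cos, sin, pi

def breathing_rate_bpm(links, fs, lo_bpm=6.0, hi_bpm=30.0, step_bpm=0.1):
    """Return the candidate rate (breaths/min) maximising the summed periodogram."""
    best_rate, best_power = None, -1.0
    steps = int(round((hi_bpm - lo_bpm) / step_bpm))
    for s in range(steps + 1):
        rate = round(lo_bpm + s * step_bpm, 1)
        freq = rate / 60.0  # candidate breathing frequency in Hz
        power = 0.0
        for samples in links:
            mean = sum(samples) / len(samples)  # remove the static RSS level
            re = sum((x - mean) * cos(2 * pi * freq * n / fs)
                     for n, x in enumerate(samples))
            im = sum((x - mean) * sin(2 * pi * freq * n / fs)
                     for n, x in enumerate(samples))
            power += re * re + im * im
        if power > best_power:
            best_rate, best_power = rate, power
    return best_rate

fs = 1 / 0.24   # one sample every 240 ms, i.e. about 4.17 Hz
n = 125         # 125 samples cover a 30 s observation window
true_bpm = 15.0 # synthetic ground truth
links = [
    [-70 + 0.5 * sin(2 * pi * (true_bpm / 60) * k / fs) for k in range(n)],
    [-65 + 0.3 * sin(2 * pi * (true_bpm / 60) * k / fs + 1.0) for k in range(n)],
]
print(breathing_rate_bpm(links, fs))
```

Summing the periodograms lets links with different phases and amplitudes reinforce each other, which is one plausible reading of why more participating nodes improve the estimate.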

With standard machine learning approaches (e.g. k-nearest 
neighbour, decision tree, Bayes, support vector machines), it 
is possible to extract further information on the environmental 
situation from RSS fluctuation. In preliminary studies, Reschke, 
Scholl, Sigg and others demonstrated the detection of opened 
or closed doors, presence and crowd size with an accuracy of 
0.6 to 0.7 [?], [?], [?], [?] (figure 8 illustrates the 
SenseWaves recognition system for the distinction of three 
fairly separated classes). 

The authors utilised USRP software defined radio (SDR) 
devices^, of which one constantly transmits a signal 
at frequencies between 900MHz and 2.4GHz that is read 
and analysed by the other nodes. The SDR devices allow high 
sampling rates of the observed signal. In their system, the 
authors employ sampling rates of 40Hz on a continuous 
signal transmitted by one node. No specific relative placement 
of nodes was required, so that the system qualifies for 
ad-hoc deployment. For recognition, simple time-domain RSS 
features have been utilised, such as the Root Mean Square 
(RMS), Average Magnitude Squared (AMS), Signal-to-Noise 
Ratio (SNR) [128], [129], [130], signal amplitude, signal peaks 
in a defined time period and the number of large deltas in 
successive signal peaks [?]. Also, the 
consideration of crowd size extends the often followed 
single-individual sensing approach [?]. The authors' learning 
approach is able to predict the count of up to 10 stationary or 
moving individuals. 
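
Some of the time-domain features listed above can be computed as in the following sketch. The exact definitions are assumptions: AMS is taken here as the mean of the squared samples, a peak as a strict local maximum, and a "large delta" as a difference above a hypothetical threshold between successive peaks.

```python
from math import sqrt

def time_domain_features(samples, delta=1.0):
    """Compute simple time-domain features over one window of signal samples."""
    n = len(samples)
    ams = sum(x * x for x in samples) / n  # average magnitude squared
    rms = sqrt(ams)                        # root mean square
    peaks = [samples[i] for i in range(1, n - 1)
             if samples[i - 1] < samples[i] > samples[i + 1]]
    large_deltas = sum(1 for a, b in zip(peaks, peaks[1:]) if abs(b - a) > delta)
    return {"rms": rms, "ams": ams, "peaks": len(peaks), "large_deltas": large_deltas}

# Hypothetical window of signal amplitudes.
window = [0.0, 2.0, 0.0, 4.0, 0.0, 4.5, 0.0]
features = time_domain_features(window)
print(features)
```

A feature vector of this kind, computed per time window, is what a standard classifier such as k-nearest neighbour or a decision tree would then be trained on.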

Later, with the consideration of additional features, including 
frequency-domain features, recognition accuracy was further 
improved [132], [133]. In addition, the authors compared several 
device-free recognition techniques and also accelerometer-based 
recognition, with the result that the active and passive 
device-free and continuous signal based systems could score 
similar results as accelerometer-based recognition systems. 
The authors also reported that some features, such as the 
variance, are robust against static environmental changes for the 
detection of dynamic activities, such as walking or crawling. In 
addition, it was possible to distinguish activities conducted by 
multiple persons simultaneously in an active SDR-based 
system. With two persons conducting activities at two locations 
and four receive devices, the authors trained the classifiers 
on the combined features and could distinguish 25 cases with 
high accuracy [134] (cf. figure 9). Later, the recognition of 
gestures in the proximity of a receive antenna was reported 
with a similar approach [?]. 

In a related work with an SDR-based but passive system, 
Shi et al. exploited signals from a nearby FM radio station 
for the detection of activities. Their method also exploits 
machine learning approaches but relies more on frequency 
domain features. In addition, their sampling rate is lower with 
about 2Hz and a sampling window of 0.5 seconds [?], [?], 
[?]. However, the accuracy achieved is comparable to the 
above active systems. 

Another approach utilising RSSI information from sensor 
nodes in an active RSSI-based system was presented in [139]. 
The authors place eight 802.15.4 nodes that transmit at 2.4 
GHz in a 20m^2 office room. The nodes were placed at various 
heights from 30cm to 1.4m. With this setting and only mean 
and variance as features, the authors could distinguish seven 
different classes at an accuracy that exceeded the accuracy 
achieved by an accelerometer attached to the subject for 
comparison. They reported that their 3D topology helps to 
distinguish activities and that there are indications that 
discrimination of subjects might also be possible. 

^ http://www.ettus.com 

Fig. 9. Constellations of 1, 2, 3 or all receivers for the 
simultaneous recognition of activities from multiple subjects [134] 


Very recently, Sigg et al. investigated the distinction of 
gestures and situations in a passive device-free system with 
only one off-the-shelf (smartphone) receiver [140], [?]. They 
observed that 10 RSSI packets per second can be expected 
in urban places and that these are sufficient to distinguish 
between simple classes and also hand gestures in proximity 
of the receiver. Although the accuracy reached was lower 
than for the active RSSI-based system reported above, it was 
clearly above random guessing. In addition, they could 
distinguish 11 gestures performed in close proximity of the phone. 

2) Doppler Shift: When an object reflecting a signal wave 
is in motion, this causes a Doppler shift. The direction and 
speed of the movement determine the strength and nature of this 
frequency shift. Pu and others showed that simultaneous detection 
of gestures from multiple individuals is possible by utilising 
multi-antenna nodes and micro-Doppler fluctuations [?], [141]. 
They utilise a USRP SDR multi-antenna receiver and one or 
more single-antenna transmitters distributed in the environment 
to distinguish between a set of 9 gestures with an average 
accuracy of 0.94. Their active device-free system exploits a 
MIMO receiver in order to recognise gestures from different 
persons present at the same time. By leveraging a preamble 
gesture pattern, the receiver estimates the MIMO channel that 
maximises the reflections of the desired user. 

A main challenge for them was that the Doppler shift 
from human movement is several orders of magnitude smaller 
than the bandwidth of the signal employed. The authors therefore 
proposed to transform the received signal into several 
narrowband pulses, which are then analysed for possible Doppler 
fluctuation. The group discussed application possibilities of 
their system in [142]. 
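
The scale of the problem can be made concrete with the standard relation for a reflector moving radially at speed v, f_d = 2 v f_c / c. The numbers below (a hand moving at 0.5 m/s, a 2.4 GHz carrier and a 20 MHz channel) are illustrative assumptions and are not taken from the cited work.

```python
C = 299_792_458.0  # speed of light in m/s

def doppler_shift_hz(speed_mps, carrier_hz):
    """Maximum Doppler shift caused by a reflector moving radially at speed_mps."""
    return 2.0 * speed_mps * carrier_hz / C

shift = doppler_shift_hz(0.5, 2.4e9)  # assumed hand speed and carrier frequency
ratio = shift / 20e6                  # relative to an assumed 20 MHz channel
print(round(shift, 2), ratio)
```

Under these assumptions the shift amounts to only a few hertz, more than six orders of magnitude below the channel bandwidth, which motivates the analysis of narrowband pulses described above.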

In a related system, Adib and Katabi employ MIMO 
interference nulling and combine samples taken over time to 
achieve a similar result while compensating for the missing 
spatial diversity in a single-antenna receiver system [?]. In 
their system, they leverage standard WiFi hardware at 2.4GHz. 

Fig. 10. Gestures recognised via RSSI fluctuation on an 
off-the-shelf mobile phone [?] 

Later, this work was extended to 3D motion tracking by 
utilising three or more directional receive antennas in exactly 
defined relative orientation [143]. In particular, the system 
is able to track the center of a human body with an error 
below 21cm in any direction and can also detect movement 
of body parts and the direction of a pointing body part, such 
as a hand. This localisation is possible through time-of-flight 
estimation and triangulation. Higher accuracy of this 
estimation is achieved by utilising frequency modulated carrier 
waves (sending a signal that changes linearly in frequency with 
time) over a bandwidth of 1.69GHz. The impact of static objects 
could be mitigated by subtracting successive sample means, 
whereas noise was filtered by its speed of change in energy 
over frequency bands. 
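
Two standard relations for linearly swept carriers illustrate why the large bandwidth matters: the range resolution is c / (2B), and the time of flight follows from the frequency difference between the transmitted and received sweep divided by the sweep slope. The sweep duration and beat frequency below are hypothetical values chosen for the example.

```python
C = 299_792_458.0  # speed of light in m/s

def range_resolution_m(bandwidth_hz):
    """Smallest resolvable difference in one-way distance for a sweep of bandwidth B."""
    return C / (2.0 * bandwidth_hz)

def round_trip_distance_m(beat_hz, bandwidth_hz, sweep_s):
    """Round-trip path length from the beat frequency of a linear sweep."""
    slope = bandwidth_hz / sweep_s  # sweep slope in Hz per second
    time_of_flight = beat_hz / slope
    return C * time_of_flight

print(range_resolution_m(1.69e9))
# Hypothetical sweep of 2.5 ms and measured beat frequency of 13.52 kHz.
print(round_trip_distance_m(13_520.0, 1.69e9, 2.5e-3))
```

With a bandwidth of 1.69GHz the resolution is just under 9cm, which is consistent in scale with the reported tracking error of below 21cm.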

IV. Toward Cognitive Activity Recognition 

Physical activity tracking has come a long way, from dedicated 
sensing devices in lab settings to consumer applications 
embedded in wearable appliances (e.g. Fitbit, Jawbone UP) and 
even dedicated human motion tracking co-processors in smart 
phones (e.g. M7 in the iPhone 5s). Now we are seeing the first 
end consumer devices that start exploring our physiological 
signals (heart rate, blood oxygen level etc.) and our sleep 
performance. 

The next logical step is the tracking of cognitive activities: 
attention, recall, cognitive load and finally learning and 
decision making. We explore in this section which sensor 
modalities seem to have the most merit and then tackle a very 
specific type of cognitive task, namely reading. We discuss 
why reading is a good choice to start with and how tracking 
can be extended towards other cognitive activities [?]. 

A. Importance of Eye Gaze 

The most obvious way to track cognitive tasks is to monitor 
the brain directly. Although this approach sounds promising, 
direct brain monitoring poses many practical problems. 
The methods are either very obtrusive (e.g. fMRI), or 
they suffer from noise and movement artifacts and are not 
easy to wear during everyday life. 

As an intermediate technology, eye movement tracking seems 
to be the most promising, as eye movements can easily be 
monitored using either optical eye tracking or 
electrooculography (electrodes placed close to the eye). Also, 
eye movements are closely correlated to cognitive activities 
and states. Analyses range from simple blink frequency 
analysis, which can indicate the fatigue of a user, 
to expertise analysis for complex visualizations. 

Fig. 11. User reading a document with head-mounted eye tracker. 

B. Quantifying Reading Habits 

In this section, we focus on tracking reading as a cognitive 
activity. There are two main reasons. First, reading is a 
ubiquitous task that is performed every day and is crucial to 
our learning and knowledge acquisition. Although there are very 
detailed studies of reading activities in the lab, there are 
very few in-situ studies about reading behavior in real life. 
Second, we believe "reading" can become to cognitive activity 
tracking what "walking" and locomotion analysis became for 
physical task tracking. Reading, analogous to walking, is easy 
to define and includes repetitive movements with distinct 
frequencies. This should make the task of spotting it easier, 
while avoiding the definition problem. Take "focusing" or 
"paying attention" as an example: spotting such cognitive 
activities depends highly on how they are defined. 

To track reading habits, we evaluated a couple of technologies 
(e.g. EEG, eye tracking, motion sensors, egocentric 
cameras) and found that mobile eyetrackers are so far the best 
suited for the task (see Fig. 11 for an exemplary setting with 
a person wearing a mobile eyetracker). Our analysis ranges from 
simple word count over reading material inference to trying 
to assess reading comprehension. 

1) Word Count and Reading Speed: It is possible to implement 
a wordometer using optical mobile eye tracking [144]. 
The number of words a user reads can be counted by 
recognizing reading, counting line breaks and then approximating 
the words read. Current implementations work with an error 
rate of around 6-11% for 10 users over 20 documents with 
sizes ranging from 150 to 680 words. 

The recognition process works as follows. First, reading is 
recognized by a support vector machine using fixation and 
saccade features [?]. Afterwards, there are several ways to 
estimate reading volume: using time only, or detecting line 
breaks (long saccades back towards a new line) to estimate 
the number of lines and, from these, the word count. The latter 
method works better (5-15% lower error rate). 
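
The line-break based estimate can be sketched as follows. The sketch assumes that reading has already been recognized and that horizontal fixation positions are available; the return threshold of 60% of the line width and the average of 11 words per line are hypothetical parameters, not values from the cited implementation.

```python
def estimate_words(fixation_x, line_width, words_per_line, min_return=0.6):
    """Count long leftward saccades as line breaks and convert lines to words."""
    line_breaks = sum(
        1 for a, b in zip(fixation_x, fixation_x[1:])
        if a - b > min_return * line_width
    )
    lines = line_breaks + 1 if fixation_x else 0
    return lines, lines * words_per_line

# Hypothetical horizontal fixation positions (pixels) while reading three lines.
# The step from 210 to 180 is a short regression and must not count as a line break.
fixation_x = [50, 200, 350, 500, 650,
              60, 210, 180, 400, 640,
              55, 300, 620]
print(estimate_words(fixation_x, line_width=700, words_per_line=11))
```

Requiring a long return saccade keeps short regressions within a line from inflating the line count.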

Reading volume in itself is associated with an increase in 
vocabulary, and there are strong correlations between size of 
vocabulary and language skill. More interestingly, however, 
reading volume also seems to be an indicator for higher general 
knowledge [?]. In itself, reading volume is therefore already 
interesting information. Yet, it can also enable novel 
applications, like annotating books with the amount of reading 
a user did (and at which pages) to give feedback to authors. 

2) Document Type Classification: Also using a mobile 
eyetracker, it is possible to tell which documents a user reads. 
In figure 12, exemplary eye-gaze patterns are displayed for 
various document types (comic book, text book, magazine 
etc.). In an experiment with 10 users reading 5 different 
document types for 10 minutes in 5 different environments 
(e.g. office, coffee shop), an accuracy of 78% for windows of 
around 1 minute is achieved independent of the user, and 
98% for the user-dependent case. As long as the document 
layout is sufficiently unique, information about the document 
is also contained in the eye movement [?]. 

This raises the interesting question whether, given a 
particular goal, there are optimal eye gaze patterns for reading 
a particular document. If this were the case, we could store the 
optimal eye gaze pattern and adjust the document accordingly 
if the user deviates from that pattern. 

3) Toward Reading Comprehension: In the same line of 
research, yet even more difficult, researchers assess whether it 
is possible to estimate expertise level from eye gaze. 

So far, the results are ambiguous regarding the estimation of 
reading comprehension. Although there is a clear correlation 
between a couple of eye gaze features and the comprehension 
of the reader, the data seems noisy, making a good inference 
difficult. Difficult words can be detected for individual users 
based on the fixation count, as difficult words show a 
statistically significant increase in fixations. Yet so far it 
was not possible to determine reading comprehension 
directly [?]. 

C. Augmenting Reading 

As a first step to explore reading comprehension further, it 
is evaluated if and how implicit text annotations using eye 
gaze can support second language learners and their teachers. 
The starting point is to give readers quantified feedback about 
their behavior by answering simple quantitative questions: How 
fast do they read a paragraph? How much re-reading do they do? 
The final aim is to give readers feedback about their 
concentration and their text comprehension level. 

The current focus is set on paragraph-based annotations, as 
these can already give valuable support to the learner and are 
feasible to implement with current technology. The initial set 
of annotations is inspired by lab-internal discussions and by 
related work [147]. 


In a prototype implementation, reading speed is highlighted 
by background color and intensity: slow speed with a darker 
hue, faster speed with a lighter hue. Reading speed is given by 
how long the participant's eye gaze stays in a paragraph region. 

The amount of re-reading is estimated by comparing the 
line count of the paragraph with the line count estimated from 
eye gaze using a method from Kunze et al. [?]. The 
amount of re-reading is shown by an arrow pointing back up 
(cf. figure 13). 

Fixations are aggregated into larger fixation areas by applying 
a filtering method from Buscher et al. [147]. The number of 
fixation areas is shown as an eye icon next to the paragraph. 

In Figure 13 we depict these annotations for a document 
read by students with good and poor English skills. The 
good student performs less re-reading and has in general a 
fast reading speed. Although the differences between the two 
participants are easy to see, eye gaze is not only influenced 
by our expertise level, but also by fatigue and other mental 
states. Therefore, it is difficult to give comprehensive 
evaluations. Moving away from reading, we can also use cognitive 
activity recognition for implicitly tagging objects and events. 


D. Cognitive Tracking for the Masses 

A major problem of studies on cognitive activity recognition 
is that it is very difficult to make them representative, as 
sample sizes are relatively small (10-20 participants). This 
problem also affects the activity recognition field at large and 
other information technology fields. For cognitive tasks, 
however, it carries additional weight. As seen 
from similar cognitive science and psychology studies, very 
large sample sizes are needed to assess the relations between 
tasks and cognitive activities, especially for complex 
processes like learning. 

One way to approach this problem is to provide afford¬ 
able commodity devices to enable contributions from people 
towards the questions of intelligence amplification. 

As eye trackers are still expensive and some people might 
not want to wear glasses, we should focus on alternative tech¬ 
nologies that are already available or can be easily integrated 
into consumer devices, to enable cognitive task tracking. 

Additionally, head-mounted display computers, most 
prominently Google Glass, seem to get more and more commercial 
attention. They are a perfect tool for cognitive task analysis, 
as they are already worn on the head. A very simple sensor 
(the infrared distance sensor of Google Glass) can, for example, 
measure eye blinks. Astonishingly, blinking frequency alone 
is already able to distinguish a couple of cognitive tasks (e.g. 
reading versus talking to a person, see Fig. 14) [?]. 

V. Discussion and Future directions 

Activity recognition will increasingly focus on Parasitic 
and Sentiment Sensing paradigms. For device-free RF-based 
recognition, we expect that the diversity of sensors on devices 
can be greatly reduced, as RF and other environmental sources 
are capable of replacing more specialised sensors with 
acceptable accuracy. This will result in a simpler and thus 
cheaper design of consumer appliances, with more accurate 
specialised sensing hardware reserved for professional devices. 

Fig. 12. Examples for different eye gaze patterns for varying 
document types (textbook, novel, magazine, newspaper, manga) 

Fig. 13. Eye-gaze annotated document for a participant with low 
English skills (first four paragraphs) and higher skills (second 
four paragraphs). First we show the raw eye gaze as recorded by 
the eyetracker, then the annotated document. Shading shows the 
reading speed: the darker the slower. The arrows on the right 
show the amount of re-reading and the size of the eye next to 
the paragraph the number of fixation areas [149]. 

Fig. 14. Blinking frequency recorded with Google Glass for 
reading (top) and talking to a person (bottom). 

Sentiment Sensing will receive considerable attention over 
the course of the next couple of years. Knowledge of 
mental states will breed a number of new applications and 
challenges. 

In addition, we expect that these sensing paradigms will 
increasingly be applied on non-expert off-the-shelf consumer 
hardware. This development will foster a wide adoption 
of these sensing paradigms and enable a number of novel 
applications as well as revenue for companies. 

A. Environmental conditions 

Since parasitic sensing exploits environmental sensing 
sources, it is natural to monitor environmental conditions 
with such signals. The sensing of traffic situations from 
environmental parasitic sources is gaining increased attention 
and might also be fuelled by vehicular communication, 
autonomous driving and pedestrian safety campaigns. Other 
measures, such as temperature, can also be sensed 
parasitically from RF. 


1) Temperature: As detailed in [?], the outside 
temperature impairs the capability of WiFi equipment, which 
might greatly reduce its transmission range. By inversion 
of the same argument, the range of WiFi equipment 
allows conclusions on the surrounding temperature. While it is 
difficult to estimate the distance between a WiFi access point 
(AP) and a wireless receiver directly, utilising changes in 
signal strength information from multiple APs should enable 
accurate prediction of the environmental temperature. 

2) Sensing traffic situations: Electromagnetic emission can 
be detected from a number of entities, including car 
engines. EMC regulation requires that emission from 
combustion engines fulfils strict requirements in the 30-1000 
MHz range [152]. Similar requirements apply to road vehicles 
with alternative power trains [?]. These emissions 
are tested with standardized radiated emission tests such as 
CISPR 12 [?] or CISPR 22 [?]. 

In [155] it has been shown how RF emission from car 
engines can be utilised in order to detect various car models. 
The authors were able to distinguish between three car 
models with an accuracy of 0.99 with the help of an Artificial 
Neural Network-based classifier. For this, the ignition spark 
was the most characteristic event. The characteristic features 
were identified over a frequency range of 2.5 GHz. 

Kassem et al. [?] sense traffic situations by tracking the 
frequency and speed of passing cars that intercept the direct 
line of sight between a pair of nodes on both sides 
of the road. Furthermore, Ding et al. demonstrated how 
emissions from car engines can be utilised for passive traffic 
awareness using either roadside installations or in-car 
modules [?], [?]. The authors employed standard 
machine learning approaches to distinguish six traffic 
situations from roadside measurements and, in addition, the 
own vehicle’s speed from in-car measurements. Recognition 
accuracies achieved in realistic environments were above 0.96 
in all cases. Possible further applications include the detection 
of traffic jams or of the number of cars waiting in front of 
a traffic light. 
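
The line-of-sight idea described above can be illustrated with a minimal Python sketch. It is not the method of the cited authors. It assumes a single RSSI trace between two roadside nodes, a known unobstructed baseline level, a fixed drop threshold (`drop_db`) and an average vehicle length (`vehicle_length_m`); all of these names and values are hypothetical and chosen for illustration only. Each contiguous run of samples below the threshold is counted as one passing vehicle, and its speed is approximated as the assumed vehicle length divided by the blocking duration.

```python
from typing import List, Tuple


def detect_passes(rssi: List[float], baseline: float,
                  drop_db: float = 6.0) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs (end exclusive) of contiguous runs
    in which the RSSI is at least drop_db below the baseline."""
    passes = []
    start = None
    for i, value in enumerate(rssi):
        blocked = value <= baseline - drop_db
        if blocked and start is None:
            start = i                      # link has just become obstructed
        elif not blocked and start is not None:
            passes.append((start, i))      # obstruction has ended
            start = None
    if start is not None:                  # trace ends while still obstructed
        passes.append((start, len(rssi)))
    return passes


def estimate_speeds(passes: List[Tuple[int, int]], sample_rate_hz: float,
                    vehicle_length_m: float = 4.5) -> List[float]:
    """Approximate speed in m/s per pass: assumed length / blocking time."""
    return [vehicle_length_m / ((end - start) / sample_rate_hz)
            for start, end in passes]


# Synthetic trace sampled at 10 Hz (values in dBm are made up):
# two vehicles interrupt the link for 0.3 s and 0.5 s respectively.
trace = [-50.0] * 5 + [-60.0] * 3 + [-50.0] * 10 + [-62.0] * 5 + [-50.0] * 4
passes = detect_passes(trace, baseline=-50.0)
speeds = estimate_speeds(passes, sample_rate_hz=10.0)
print(len(passes), "vehicles, speeds in m/s:", speeds)
```

The number of detected passes per time window gives the passing frequency. A practical system would additionally have to learn the baseline, smooth the trace and cope with vehicles of different lengths, which this sketch ignores.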

B. Sentiment and mental states 

As detailed above, sentiment and mental states are on the 
verge of being recognised from environmental and on-body 
sensors. 

1) Emotion: Emotion can be inferred from body gesture 
and pose [?] at least as accurately as from the face [?], [?], 
[?]. The role of the human body in emotion expression has 
received support through evidence from psychology [161] 
and nonverbal communication [162]. The importance of bodily 
expression has also been confirmed for emotion detection 
[?], [?], [?]. 

Walter and Walk [161] revealed that emotion recognition 
from photos of postural expression, in which faces and hands 
were covered, was as accurate as recognition from facial 
expression alone. Dynamic configurations of the human body 
hold an even greater amount of information, as indicated by Bull in 
[166]. He showed that body positions and motions could be 
recognized in terms of states including interest/boredom and 
agreement/disagreement. Other studies went further by 
examining the contribution of separate body parts to particular 
emotional states [?], [168]. Emotion can also be recognized 
in non-trivial scenarios, such as simple daily-life actions 
[169], [170] or the recognition ability of infants [171]. 

It will be interesting to see how well RF information can 
be exploited to identify body gesture and pose and to 
map these to human emotion classes. 

2) Attention: Attention is an important measure in 
Computer-Human interaction. For an interactive system, it 
determines the potential to impact the actions and decisions taken 
by an individual [172]. The same action of the same system 
might be perceived as an annoyance or appreciated 
as helpful, depending on whether the individual was focusing 
part or all of her attention on the system or not. In the 
literature, we find various definitions that classify attention 
as well as its determining characteristics [?], [?]. While 
the tracking of gaze is a commonly utilised measure of attention [?], 
other observable features may also indicate attention. 
In general, aspects such as Saliency, Effort, Expectancy 
and Value are important indicators of attention [?], [?], 
[?], [177]. This model was later extended to put greater 
stress on the effort a person takes towards an object [178]. 
The authors also discuss various aspects of attention and 
identify changes in walking speed, direction or orientation 
as the most distinguishing factors. 

In [?] it was investigated how these properties, in particular 
the location of a person, walking direction and walking 
speed, or changes therein, can be utilised for the monitoring 
and detection of attention. This was a preliminary study 
that lacked generalisation and high accuracy, but we expect 
further improvements in attention recognition via RF soon. 
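
To make the role of walking speed and direction more concrete, the following Python sketch derives both from a sequence of positions and raises a simple cue when a person slows down or turns. It is only an illustration and not the approach of the cited study: the position track is assumed to be already available (for instance from an RF-based localisation step), and the function names and the thresholds `slow_ratio` and `turn_rad` are hypothetical.

```python
import math


def motion_features(track, dt):
    """track: list of (x, y) positions in metres, sampled every dt seconds.
    Returns per-step speeds (m/s) and absolute heading changes (radians)."""
    speeds, headings = [], []
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        speeds.append(math.hypot(dx, dy) / dt)
        headings.append(math.atan2(dy, dx))
    turns = []
    for h0, h1 in zip(headings, headings[1:]):
        # Wrap the heading difference into [-pi, pi) before taking its size.
        diff = (h1 - h0 + math.pi) % (2 * math.pi) - math.pi
        turns.append(abs(diff))
    return speeds, turns


def attention_cue(track, dt, slow_ratio=0.5, turn_rad=math.pi / 4):
    """Hypothetical cue: True if the person slows to slow_ratio of the
    initial speed or below, or changes heading by at least turn_rad."""
    speeds, turns = motion_features(track, dt)
    slowed = any(s <= slow_ratio * speeds[0] for s in speeds[1:])
    turned = any(t >= turn_rad for t in turns)
    return slowed or turned


# Made-up tracks sampled once per second.
print(attention_cue([(0, 0), (1, 0), (2, 0), (3, 0)], 1.0))      # steady walk
print(attention_cue([(0, 0), (1, 0), (1.3, 0), (1.4, 0)], 1.0))  # slowing down
print(attention_cue([(0, 0), (1, 0), (1, 1)], 1.0))              # turning
```

Such hand-set thresholds would in practice be replaced by a classifier trained on labelled data; the sketch only shows how the motion features themselves can be computed.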

C. Enhancing Recall and Focus 

Successfully tracking mental states such as emotion or attention 
enables us to improve our cognitive abilities. The ultimate goal of 
research conducted in these directions is to improve memory, 
concentration and finally decision making. 

If we can track attention levels and cognitive load, we can 
identify the best times for the user to relax, learn, study or 
engage in spare-time activities, depending on their current 
cognitive state. 

D. Device-free RF-based recognition on consumer hardware 

Currently, RF-based device-free recognition from 
continuous-signal based devices (such as SDR nodes) 
can be considered solved. Future directions are towards 
recognition on consumer devices. With the introduction 
of OFDM to many WiFi-class devices, some of the features 
of SDR nodes, such as the utilisation of multipath information, 
can be obtained from OFDM channel state information (CSI). 
For WiFi-based indoor localisation, this has already been 
employed recently to achieve sub-meter accuracy [?]. 
In contrast to RSSI, the CSI contains channel response 
information as a PHY layer power feature [180]. Therefore, 
it becomes possible to discriminate multipath characteristics, 
which hold the potential for more accurate classification 
of activities from RF. 

Fig. 15. Multipath information contained in CSI PHY layer features in 
contrast to plain RSSI 

The utilisation of the channel response was until 
recently only possible with sophisticated SDR 
hardware [181], [182], [183]. With the introduction of Orthogonal 
Frequency Division Multiplexing (OFDM) for the WiFi 802.11 
a/g/n standards, the channel response can now 
partially be extracted from off-the-shelf OFDM receivers, 
revealing amplitudes and phases of each subcarrier [?]. 
While RSSI is not able to capture the multipath effects 
in an environment, as depicted in Fig. 15, the channel 
response available via CSI possesses finer-grained frequency 
resolution and higher time resolution to distinguish multipath 
components. 
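
The difference between RSSI and CSI can be illustrated with a small simulation. The Python sketch below is a simplified model under stated assumptions, not a description of any particular receiver: the channel is modelled as a sum of discrete paths, each with a hypothetical amplitude and delay, evaluated on 64 subcarriers with the 312.5 kHz spacing of 20 MHz 802.11a/g OFDM. The per-subcarrier response stands in for CSI, while the mean power over all subcarriers stands in for a single RSSI-like value. The function names are illustrative.

```python
import cmath
import math


def channel_response(paths, subcarrier_freqs_hz):
    """paths: list of (amplitude, delay_s) pairs. Returns the complex
    channel response H(f) = sum_i a_i * exp(-j 2 pi f tau_i) per subcarrier."""
    return [
        sum(a * cmath.exp(-2j * math.pi * f * tau) for a, tau in paths)
        for f in subcarrier_freqs_hz
    ]


def total_power_db(response):
    """RSSI-like scalar: mean power over all subcarriers, in dB."""
    power = sum(abs(h) ** 2 for h in response) / len(response)
    return 10 * math.log10(power)


def amplitude_spread(response):
    """Difference between the largest and smallest subcarrier amplitude."""
    amplitudes = [abs(h) for h in response]
    return max(amplitudes) - min(amplitudes)


# 64 baseband subcarrier offsets with 312.5 kHz spacing.
freqs = [k * 312.5e3 for k in range(-32, 32)]

# Assumed scenarios: one single-path channel and one two-path channel whose
# second path arrives 100 ns later with half the amplitude. The single-path
# amplitude is chosen so that both channels carry the same mean power.
flat = channel_response([(math.sqrt(1.25), 0.0)], freqs)
multi = channel_response([(1.0, 0.0), (0.5, 100e-9)], freqs)

print("RSSI-like power (dB):", total_power_db(flat), total_power_db(multi))
print("amplitude spread:", amplitude_spread(flat), amplitude_spread(multi))
```

In this constructed example both channels yield practically the same aggregate power, so an RSSI-only observer could not tell them apart, whereas the per-subcarrier amplitudes are constant in the single-path case and vary strongly in the two-path case.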

Apart from this straightforward future research direction 
(which is already being pursued by several groups), we 
can also identify more specific open research questions as 
follows. 

1) Empowering WiFi access points: Authors have demonstrated 
the detection of several situations (for instance presence 
or crowd size) from RSSI information on a mobile phone [?]. 
Even more interesting is the estimation of crowd size or 
presence at a WiFi AP. At the access point, the incoming 
packets originate from multiple devices at multiple locations. 
In addition, traffic from an individual mobile device is typically 
much lower than the traffic generated by an AP. It is 
not clear a priori whether the snippets of RSSI samples from 
distinct mobile devices are sufficient to estimate classes like 
crowd size or presence at a WiFi AP. In particular, analysis 
of the evolution and fluctuation of the average RSSI level as 
well as normalisation of incoming flows regarding their signal 
strength might help to acquire such information. 
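
One possible reading of the normalisation step mentioned above is sketched below in Python. This is an assumption made for illustration, not an established method: each device's RSSI samples are centred on that device's own mean, so that flows from near and far devices become comparable, and the pooled standard deviation of the centred samples serves as a simple fluctuation feature. The function names and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalise_flows(samples):
    """samples: list of (device_id, rssi_dbm) as seen at the access point.
    Returns (device_id, deviation_db) with each device's mean removed."""
    by_device = defaultdict(list)
    for device, rssi in samples:
        by_device[device].append(rssi)
    means = {device: mean(values) for device, values in by_device.items()}
    return [(device, rssi - means[device]) for device, rssi in samples]


def fluctuation(samples):
    """Population standard deviation of the per-device centred samples."""
    deviations = [dev_db for _, dev_db in normalise_flows(samples)]
    return pstdev(deviations)


# Made-up samples: device "a" is close to the AP, device "b" is far away.
quiet = [("a", -40), ("a", -40), ("b", -70), ("b", -70)]
busy = [("a", -38), ("a", -42), ("b", -67), ("b", -73)]

print(fluctuation(quiet))   # no variation left once the offsets are removed
print(fluctuation(busy))    # variation caused by changes in the environment
```

Without the per-device centring, the 30 dB offset between the two devices would dominate the pooled statistics and hide the much smaller fluctuations that are of interest here.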

2) Activity recognition from 3G and 4G signals: In [24] it 
was demonstrated how RSSI information from WiFi traffic can 
be utilised to identify environmental situations and gestures 
conducted in the proximity of a WiFi receiver. Similarly, it 
will be possible to utilise 3G or 4G signals to distinguish 
similar classes. For this, however, the first step is the 
modification of the firmware for the 3G or 4G interface to 
allow access to signal strength information at a higher sampling 
frequency, as was done for the WiFi interface in [?]. 

VI. Conclusion 

In this survey we have discussed recent advances in activity 
recognition which are leading towards two emerging sensing 
paradigms, namely Parasitic Sensing and Sentiment Sensing. 
Both are fostered by the extreme increase in sensing devices 
in people’s environments. While classical sensing on mobile 
devices covers the surface of an individual’s actions, namely 
her directly observable conditions, actions, movement and 
gestures, future sensing paradigms extend the reach of a device’s 
perception. Parasitic Sensing utilises noise of environmental, 
pre-installed systems and thereby captures stimuli from a 
device’s near to mid-distance surroundings. Sentiment Sensing, 
on the other hand, reaches inwards, focusing on mental state 
and sentiment. We see great potential for novel applications 
and revenue in both these paradigms. 

Parasitic Sensing is fostered by the rise of the Internet 
of Things, which will deploy a multitude of sensing and 
communicating devices in the environment. In particular, these 
devices will feature an interface to the RF channel, which is 
why we envision this as the one universally employed sensor 
on such devices. Apart from the RF noise that already exists 
today, IoT devices will generate significant additional traffic 
and transform the RF interface into a rich sensing source for 
environmental activities. 

Sentiment Sensing, in contrast, benefits from the hype around 
novel body-worn devices, such as instrumented glasses or 
biosensors in a number of appliances. Eye tracking is a rich source 
for the detection of a number of mental activities such as 
reading, and also for the monitoring of attention or, for instance, 
fatigue. Already today, products are announced which target 
this novel field of sensing. This will offer new insights for 
applications and enable new fields of assistance for mobile 
devices. 

We expect these sensing directions to flourish over the 
coming years and thereby to advance ubiquitous and pervasive 
sensing to new frontiers. 